<110> McIninch, James 

<120> COMPUTATIONAL NUCLEIC ACID CODING AND FEATURE ANALYSIS 

<130> 04983.0220.00US00 

<160> 4 

<170> PatentIn version 3.0 

<210> 1 

<211> 2165 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> unsure 

<222> (1)...(2165) 

<223> Unsure at all n locations 

<220> 

<223> Ecotype Landsberg, genomic DNA 
<400> 1 

tactcaaaaa tatattccat gcttaattag gcWgattcg cggtgacgat gcaccaagag     60
cggtttttcc gagcattgta ggccgtcctc gcckcaccgg tgtgatggtt gggatgggac    120
aaaaggatgc ttatgttgga gacgaggctc aatcLaacg tggtatcttg actctgaagt    180
acccaattga gcatggaatt gttaataatt gggatdacat ggagaagatt tggcatcaca    240
ctttctacaa tgagcttcgt gttgcccctg aagaac^cc ggttctcttg accgaagctc    300
ctctcaatcc gaaagctaac cgtgagaaga tgactca^t catgtttgag acattcaata    360
ctcctgctat gtatgttgcc attcaagctg ttctctcac\ ctatgccagt ggccgtacta    420
ctggtcagta cattactaca ttctttttat accgtttggt tgaaataaaa ttcggtttgg    480
ttcgattcga gtttgctctc attattttta ttttgttggt Lggtattgt tttggactcc    540
ggagatggtg tgagccacac ggtaccaatc tacgagggtt aegcacttcc acacgcaatc    600
ctgcgtcttg atcttgcagg tcgtgaccta accgaccacc ttaVgaaaat cctgacagag    660
cgtggttact ctttcaccac aactgctgag cgtgagattg ttagLacat gaaggagaag    720
ctctcttaca ttgccttgga ctttgaacaa gagctcgaga cttccakac aagctcatcc    780
gttgagaaga gcttcgagct gccagacggt caagtgatca ccatcggg^ agagcgtttc    840
cgatgccctg aagttctgtt tcagccatcg atgatcggaa tggaaaatccVggaattcat    900
gaaactactt acaactcaat catgaaatgt gatgtggata tcaggaagga tctttatgga    960
aacattgtgc ttagtggtgg caccacaatg ttcgatggga ttggtgatag gatgagtaaa   1020
gagatcacag cgttggctcc aagcagtatg aacatcaaag tggtggctcc accggaaagg   1080
aagtacagtg tctggatcgg tggctctatc ttggcttccc tcagtacttt ccagcaggta   1140
aattacttac tatacttaat acataaagtc tattagtgat ttgatgtata aagtgttaca   1200
aaaatgtgtt ccaaatttgc agatgtggat tgcgaaagcg gagtatgatg aatctggacc   1260
gtcaatcgtc cacaggaagt gcttctgatc aaaagtcacc aagtaaaaca agagcggtaa   1320
aaattttgat atcagttttt caccctgaag ccagttgcta taattactca caacttctct   1380
atttgtgttc ttttattctt gtccctcgtt gttcatttta atctcttttt tgcaacaaag   1440
caacttaaaa aaacagagca gtcattaaca gaatgttatt attatatata tgtatacata   1500
ttagtataca cccattattt cattaaaaca tttatcatat aaggatagga ttctatacat   1560
cgatatattt attttgttga cactattcag cacatgctta tgtcttatct tgttagtata   1620
tgtaaccaaa gacaaataat agatgctaca aattgttttc tttgaagcaa aaatttcaat   1680
cttaaaattg tttttttcca ggttacacaa aaaaaacttg tagtttgtaa attttctata   1740
caattttggg gatctcaaca agaacatgaa cttcaacttc tagtcatatg acgacctgag   1800
tctgcgcggc tgtgaatctc tttgctgcag taaatgttta caagtggtgt gtaaattggt   1860
actgattcaa aagctttaag aaatctacac atttcgtgaa attatttagc agacttgata   1920
ttaaaaatct aggataaaat gactatccaa agacaaatag gactgtttca catgttcccc   1980
tgattcttgt agctcataac tcatcagcag ttaacttttc tacctcatac acgctcgcaa   2040
tncgtttgga attatcagct ntaatttttc taattctttg gaaattatta gcagctcgat   2100
caaatggggc atggcttctt cttctatctg caactcatct aaactttcca tgaagaaaca   2160
aagct                                                               2165



<210> 2 

<211> 423 

<212> PRT 

<213> Unknown 

<220> 

<223> Describes a predicted protein sequence 
<220> 

<221> site 

<222> (1)...(423) 

<223> A stop codon is predicted at all XAA locations 
<400> 2 

Xaa Arg Phe Phe Arg Ala Leu Xaa Ala Val Leu Ala Thr Pro Val Xaa 
1 5 10 15 

Trp Leu Gly Trp Asp Lys Arg Met Leu Met Leu Glu Thr Arg Leu Asn 
20 25 30 

Gln Asn Val Val Ser Xaa Leu Xaa Ser Thr Gln Leu Ser Met Glu Leu 
35 40 45 

Leu Ile Ile Gly Met Thr Trp Arg Arg Phe Gly Ile Thr Leu Ser Thr 
50 55 60 

Met Ser Phe Val Leu Pro Leu Lys Asn Ile Arg Xaa Leu Thr Glu Ala 
65 70 75 80 

Pro Leu Asn Pro Lys Ala Asn Arg Glu Lys Met Thr Gln Ile Met Phe 
85 90 95 

Glu Thr Phe Asn Thr Pro Ala Met Tyr Val Ala Ile Gln Ala Val Leu 
100 105 110 

Ser Leu Tyr Ala Ser Gly Arg Thr Thr Gly Gln Tyr Ile Thr Thr Phe 
115 120 125 

Phe Leu Tyr Arg Xaa Ser Gly Asp Gly Val Ser His Thr Val Pro Ile 
130 135 140 

Tyr Glu Gly Tyr Ala Leu Pro His Ala Ile Leu Arg Leu Asp Leu Ala 
145 150 155 160 

Gly Arg Asp Leu Thr Asp His Leu Met Lys Ile Leu Thr Glu Arg Gly 
165 170 175 

Tyr Ser Phe Thr Thr Thr Ala Glu Arg Glu Ile Val Arg Asp Met Lys 
180 185 190 

Glu Lys Leu Ser Tyr Ile Ala Leu Asp Phe Glu Gln Glu Leu Glu Thr 
195 200 205 

Ser Lys Thr Ser Ser Ser Val Glu Lys Ser Phe Glu Leu Pro Asp Gly 
210 215 220 

Gln Val Ile Thr Ile Gly Ala Glu Arg Phe Arg Cys Pro Glu Val Leu 
225 230 235 240 

Phe Gln Pro Ser Met Ile Gly Met Glu Asn Pro Gly Ile His Glu Thr 
245 250 255 

Thr Tyr Asn Ser Ile Met Lys Cys Asp Val Asp Ile Arg Lys Asp Leu 
260 265 270 

Tyr Gly Asn Ile Val Leu Ser Gly Gly Thr Thr Met Phe Asp Gly Ile 
275 280 285 

Gly Asp Arg Met Ser Lys Glu Ile Thr Ala Leu Ala Pro Ser Ser Met 
290 295 300 

Lys Ile Lys Val Val Ala Pro Pro Glu Arg Lys Tyr Ser Val Trp Ile 
305 310 315 320 

Gly Gly Ser Ile Xaa Val Pro Asn Leu Gln Met Trp Ile Ala Lys Ala 
325 330 335 

Glu Tyr Xaa Asn Leu Asp Arg Gln Ser Ser Thr Gly Ser Ala Ser Asp 
340 345 350 

Gln Lys Ser Pro Ser Lys Thr Arg Ala Val Lys Ile Leu Xaa Asn Ser 
355 360 365 

Ser Ala Val Asn Phe Ser Thr Ser Tyr Thr Leu Ala Ile Arg Leu Glu 
370 375 380 

Leu Ser Ala Leu Ile Phe Leu Ile Ser Leu Glu Ile Ile Ser Ser Ser 
385 390 395 400 

Ile Lys Trp Gly Met Ala Ser Ser Ser Ile Cys Asn Ser Ser Lys Leu 
405 410 415 

Ser Met Lys Lys Gln Ser Xaa 
420 

<210> 3 
<211> 422 
<212> PRT 
<213> Unknown 

<220> 

<223> Describes a predicted protein sequence 
<220> 

<221> site 

<222> (1)...(422) 

<223> A stop codon is predicted at all XAA locations 



<400> 3 

Xaa Arg Phe Phe Arg Ala Leu Xaa Ala Val Leu Ala Thr Pro Val Xaa 
1 5 10 15 

Trp Leu Gly Trp Asp Lys Arg Met Leu Met Leu Glu Thr Arg Leu Asn 
20 25 30 

Gln Asn Val Val Ser Xaa Leu Xaa Ser Thr Gln Leu Ser Met Glu Leu 
35 40 45 

Leu Ile Ile Gly Met Thr Trp Arg Arg Phe Gly Ile Thr Leu Ser Thr 
50 55 60 

Met Ser Phe Val Leu Pro Leu Lys Asn Ile Arg Xaa Leu Thr Glu Ala 
65 70 75 80 

Pro Leu Asn Pro Lys Ala Asn Arg Glu Lys Met Thr Gln Ile Met Phe 
85 90 95 

Glu Thr Phe Asn Thr Pro Ala Met Tyr Val Ala Ile Gln Ala Val Leu 
100 105 110 

Ser Leu Tyr Ala Ser Gly Arg Thr Thr Gly Gln Tyr Ile Thr Thr Phe 
115 120 125 

Phe Leu Tyr Arg Xaa Ser Gly Asp Gly Val Ser His Thr Val Pro Ile 
130 135 140 

Tyr Glu Gly Tyr Ala Leu Pro His Ala Ile Leu Arg Leu Asp Leu Ala 
145 150 155 160 

Gly Arg Asp Leu Thr Asp His Leu Met Lys Ile Leu Thr Glu Arg Gly 
165 170 175 

Tyr Ser Phe Thr Thr Thr Ala Glu Arg Glu Ile Val Arg Asp Met Lys 
180 185 190 

Glu Lys Leu Ser Tyr Ile Ala Leu Asp Phe Glu Gln Glu Leu Glu Thr 
195 200 205 

Ser Lys Thr Ser Ser Ser Val Glu Lys Ser Phe Glu Leu Pro Asp Gly 
210 215 220 

Gln Val Ile Thr Ile Gly Ala Glu Arg Phe Arg Cys Pro Glu Val Leu 
225 230 235 240 

Phe Gln Pro Ser Met Ile Gly Met Glu Asn Pro Gly Ile His Glu Thr 
245 250 255 

Thr Tyr Asn Ser Ile Met Lys Cys Asp Val Asp Ile Arg Lys Asp Leu 
260 265 270 

Tyr Gly Asn Ile Val Leu Ser Gly Gly Thr Thr Met Phe Asp Gly Ile 
275 280 285 

Gly Asp Arg Met Ser Lys Glu Ile Thr Ala Leu Ala Pro Ser Ser Met 
290 295 300 

Lys Ile Lys Val Val Ala Pro Pro Glu Arg Lys Tyr Ser Val Trp Ile 
305 310 315 320 

Gly Gly Ser Ile Leu Ala Ser Xaa Gln Met Trp Ile Ala Lys Ala Glu 
325 330 335 

Tyr Xaa Asn Leu Asp Arg Gln Ser Ser Thr Gly Ser Ala Ser Asp Gln 
340 345 350 

Lys Ser Pro Ser Lys Thr Arg Ala Val Lys Ile Leu Xaa Asn Ser Ser 
355 360 365 

Ala Val Asn Phe Ser Thr Ser Tyr Thr Leu Ala Ile Arg Leu Glu Leu 
370 375 380 

Ser Ala Leu Ile Phe Leu Ile Ser Leu Glu Ile Ile Ser Ser Ser Ile 
385 390 395 400 

Lys Trp Gly Met Ala Ser Ser Ser Ile Cys Asn Ser Ser Lys Leu Ser 
405 410 415 

Met Lys Lys Gln Ser Xaa 
420 

<210> 4 
<211> 296 
<212> PRT 
<213> Arabidopsis thaliana 

<220> 

<223> Ecotype Columbia, describes 

<400> 4 



Met Glu Lys Ile Trp His His Thr Phe Tyr Asn Glu Leu Arg Val Ala 
1 5 10 15 

Pro Glu Glu His Pro Val Leu Leu Thr Glu Ala Pro Leu Asn Pro Lys 
20 25 30 

Ala Asn Arg Glu Lys Met Thr Gln Ile Met Phe Glu Thr Phe Asn Thr 
35 40 45 

Pro Ala Met Tyr Val Ala Ile Gln Ala Val Leu Ser Leu Ala Ser Gly 
50 55 60 

Arg Thr Thr Gly Gly Ile Val Leu Asp Ser Gly Asp Gly Val Ser His 
65 70 75 80 

Thr Val Pro Ile Tyr Glu Gly Tyr Ala Leu Pro His Ala Ile Leu Arg 
85 90 95 

Leu Asp Leu Ala Gly Arg Asp Leu Thr Asp His Leu Met Lys Ile Leu 
100 105 110 

Thr Glu Arg Gly Tyr Ser Phe Thr Thr Thr Ala Glu Arg Glu Ile Val 
115 120 125 

Arg Asp Met Lys Glu Lys Leu Ser Tyr Ile Ala Leu Asp Phe Glu Gln 
130 135 140 

Glu Leu Glu Thr Ser Lys Thr Ser Ser Ser Val Glu Lys Ser Phe Glu 
145 150 155 160 

Leu Pro Asp Gly Gln Val Ile Thr Ile Gly Ala Glu Arg Phe Arg Cys 
165 170 175 

Pro Glu Val Leu Phe Gln Pro Ser Met Ile Gly Met Glu Asn Pro Gly 
180 185 190 

Ile His Glu Thr Thr Tyr Asn Ser Ile Met Lys Cys Asp Val Asp Ile 
195 200 205 

Arg Lys Asp Leu Tyr Gly Asn Ile Val Leu Ser Gly Gly Thr Thr Met 
210 215 220 

Phe Gly Gly Ile Gly Asp Arg Met Ser Lys Glu Ile Thr Ala Leu Ala 
225 230 235 240 

Pro Ser Ser Met Lys Ile Lys Val Val Ala Pro Pro Glu Arg Lys Tyr 
245 250 255 

Ser Val Trp Ile Gly Gly Ser Ile Leu Ala Ser Leu Ser Thr Phe Gln 
260 265 270 

Gln Met Gln Met Trp Ile Ala Lys Ala Glu Tyr Asp Glu Ser Gly Pro 
275 280 285 

Ser Ile Val His Arg Lys Cys Phe 
290 295 